NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Testing for reviewer anchoring in peer review: A randomized controlled trial

https://doi.org/10.1371/journal.pone.0301111

Liu, Ryan; Jecmen, Steven; Conitzer, Vincent; Fang, Fei; Shah, Nihar B (November 2024, PLOS ONE)
Leitner, Stephan (Ed.)
ObjectivePeer review frequently follows a process where reviewers first provide initial reviews, authors respond to these reviews, then reviewers update their reviews based on the authors’ response. There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores. In this study, we aim to investigate whether reviewersanchorto their original scores when updating their reviews, which serves as a potential explanation for the lack of updates in reviewer scores. DesignWe design a novel randomized controlled trial to test if reviewers exhibit anchoring. In the experimental condition, participants initially see a flawed version of a paper that is corrected after they submit their initial review, while in the control condition, participants only see the correct version. We take various measures to ensure that in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group. Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and employ deception to hide the true experiment purpose. ResultsOur randomized controlled trial consists of 108 researchers as participants. First, we find that our intervention was successful at creating a difference in perceived paper quality between the flawed and corrected versions: Using a permutation test with the Mann-WhitneyUstatistic, we find that the experimental group’s initial scores are lower than the control group’s scores in both the Evaluation category (Vargha-DelaneyA= 0.64,p= 0.0096) and Overall score (A= 0.59,p= 0.058). Next, we test for anchoring by comparing the experimental group’s revised scores with the control group’s scores. We find no significant evidence of anchoring in either the Overall (A= 0.50,p= 0.61) or Evaluation category (A= 0.49,p= 0.61). The Mann-WhitneyUrepresents the number of individual pairwise comparisons across groups in which the value from the specified group is stochastically greater, while the Vargha-DelaneyAis the normalized version in [0, 1].
more » « less
Full Text Available
A randomized controlled trial on anonymizing reviewers to each other in peer review discussions

https://doi.org/10.1371/journal.pone.0315674

Rastogi, Charvi; Song, Xiangchen; Jin, Zhijing; Stelmakh, Ivan; Daumé, Hal; Zhang, Kun; Shah, Nihar B (December 2024, PLOS ONE)
Bailey, Henry Hugh (Ed.)
Many peer-review processes involve reviewers submitting their independent reviews, followed by a discussion between the reviewers of each paper. A common question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this question by conducting a randomized controlled trial at the Conference on Uncertainty in Artificial Intelligence (UAI) 2022 conference where reviewer discussions were conducted over a typed forum. We randomly split the reviewers and papers into two conditions–one with anonymous discussions and the other with non-anonymous discussions. We also conduct an anonymous survey of all reviewers to understand their experience and opinions. We compare the two conditions in terms of the amount of discussion, influence of seniority on the final decisions, politeness, reviewers’ self-reported experiences and preferences. Overall, this experiment finds small, significant differences favoring the anonymous discussion setup based on the evaluation criteria considered in this work.
more » « less
Full Text Available
Assisting Human Decisions in Document Matching

Kim, Joon Sik; Chen, Valerie; Pruthi, Danish; Shah, Nihar B.; Talwalkar, Ameet. (July 2023, Transactions on machine learning research)

Full Text Available
Challenges, experiments, and computational solutions in peer review

https://doi.org/10.1145/3528086

Shah, Nihar B. (June 2022, Communications of the ACM)

Improving the peer review process in a scientific manner shows promise.
more » « less
Full Text Available
Loss Functions, Axioms, and Peer Review

Noothigattu, Ritesh; Shah, Nihar B.; Procaccia, Ariel D. (January 2021, Journal of artificial intelligence research)

Full Text Available
Active ranking from pairwise comparisons and when parametric assumptions do not help

https://doi.org/10.1214/18-AOS1772

Heckel, Reinhard; Shah, Nihar B.; Ramchandran, Kannan; Wainwright, Martin J. (December 2019, The Annals of Statistics)

Full Text Available
Feeling the Bern: Adaptive Estimators for Bernoulli Probabilities of Pairwise Comparisons

https://doi.org/10.1109/TIT.2019.2903249

Shah, Nihar B.; Balakrishnan, Sivaraman; Wainwright, Martin J. (August 2019, IEEE Transactions on Information Theory)

Full Text Available
PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review

Stelmakh, Ivan; Shah, Nihar B; Singh, Aarti (March 2019, Algorithmic Learning Theory)

We consider the problem of automated assignment of papers to reviewers in conference peer review, with a focus on fairness and statistical accuracy. Our fairness objective is to maximize the review quality of the most disadvantaged paper, in contrast to the popular objective of maximizing the total quality over all papers. We design an assignment algorithm based on an incremental max-flow procedure that we prove is near-optimally fair. Our statistical accuracy objective is to ensure correct recovery of the papers that should be accepted. With a sharp minimax analysis we also prove that our algorithm leads to assignments with strong statistical guarantees both in an objective-score model as well as a novel subjective-score model that we propose in this paper.
more » « less
Full Text Available
Low Permutation-Rank Matrices: Structural Properties and Noisy Completion

https://doi.org/10.1109/ISIT.2018.8437629

Shah, Nihar B.; Balakrishnan, Sivaraman; Wainwright, Martin J. (June 2018, IEEE International Symposium on Information Theory (ISIT))

Full Text Available
Design and analysis of the nips 2016 review process

Shah, Nihar B; Tabibian, Behzad; Muandet, Krikamol; Guyon, Isabelle; Von Luxburg, Ulrike (January 2018, Journal of machine learning research)

Neural Information Processing Systems (NIPS) is a top-tier annual conference in machine learning. The 2016 edition of the conference comprised more than 2,400 paper submissions, 3,000 reviewers, and 8,000 attendees. This represents a growth of nearly 40% in terms of submissions, 96% in terms of reviewers, and over 100% in terms of attendees as compared to the previous year. The massive scale as well as rapid growth of the conference calls for a thorough quality assessment of the peer-review process and novel means of improvement. In this paper, we analyze several aspects of the data collected during the review process, including an experiment investigating the efficacy of collecting ordinal rankings from reviewers. We make a number of key observations, provide suggestions that may be useful for subsequent conferences, and discuss open problems towards the goal of improving peer review.
more » « less
Full Text Available

Search for: All records